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Abstract 

In this work, we introduce an algorithm to compute the derivatives of physical observables along the 
constrained subspace when flexible constraints are imposed on the system (i.e., constraints in which the 
constrained coordinates are fixed to configuration-dependent values). The presented scheme is exact, it 
does not contain any tunable parameter, and it only requires the calculation and inversion of a sub-block 
of the Hessian matrix of second derivatives of the function through which the constraints are defined. 
We also present a practical application to the case in which the sought observables are the Euclidean 
coordinates of complex molecular systems, and the function whose minimization defines the flexible 
constraints is the potential energy. Finally, and in order to validate the method, which, as far as we 
are aware, is the first of its kind in the literature, we compare it to the natural and straightforward 
finite-differences approach in a toy system and in three molecules of biological relevance: methanol, 
N-methyl-acetamide and a tri-glycine peptide. 

Author Summary 
Introduction 

In the theoretical and computational modeling of physical systems, including but not limited to condensed- 
matter materials [Rapaport(2004)], fluids [Allen and Tildesley(2005)], and biological molecules [Frenkel 
and Smit(2002)], it is very common to appeal to the concept of constraints. When a given quantity 
related to the system under study is constrained, it is not allowed to depend explicitly on time (or on 
any other parameter that describes the evolution of the system in the problem at hand). Instead, a 
constrained quantity is either set to a constant value (hard or rigid constraints) or to a function of the 
rest of degrees of freedom {flexible, elastic or soft constraints); in such a way that, if it depends on time, 
it does so through the latter and not in an explicit manner. 

The imposition of constraints is useful in a wide variety of contexts in the fields of computational 
physics and chemistry: For example, we can use constraints to maintain an exact symmetry of the equa- 
tions of motion; like in Car-Parrinello molecular dynamics (MD) [Car and Parrinello(1985)], where the 
time-dependent Kohn-Sham orbitals need to be orthonormal along the time evolution of the quantum- 
classical system, a requirement that can be fulfilled by imposing constraints over their scalar product [Hut- 
ter and Curiom(2005)]. In a different context, we can use constraints, as in the Blue Moon Ensemble 
technique [Carter ct al.(1989)Carter, Ciccotti, Hynes, and Kapral], to fix some macroscopic, represen- 
tative degrees of freedom of molecular systems (normally called reaction coordinates), in order to be 
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able to compute free energy profiles along them that would take an unfeasibly long time if we used an 
unconstrained simulation. Probably the most common application of the idea of constraints, and the one 
that will be mainly discussed in this work, appears when we fix the fastest, hardest degrees of freedom 
of molecular systems, such as bond lengths or bond angles, in order to allow for larger time-steps in MD 
simulations [Schlick et al.(1997)Schlick, Barth, and Mandziuk, Ryckaert ct al.(1977)Ryckaert, Ciccotti, 
and Bcrcndscn]. 

In any of these cases (assuming that the dimensions of the spaces involved are all finite) the imposition 
of constraints can be described in the following way: If the state of the system is parameterized by 
a given set of coordinates q := ((7^)^=1, spanning the whole space, W, and the associated momenta 
p := {Pfi)^=i, a given constrained subspace, /C, of dimension K < N, can be defined by giving a set of 
L := N — K independent relations among the coordinates (In this work, we will only deal with holonomic, 
scleronomous constraints, i.e., those that are independent both of the momenta and (explicitly) of time.): 

h\q)^0, I = K + l,...,N. (1) 

The condition of these constraints being independent amounts to asking the set of L vectors of N 
components 

dh'iq):^ i^Q^iq)) I = K + 1, . . . , N , (2) 

to be linearly independent at the relevant points q satisfying (1), and it means that /C is a manifold 
of constant dimension in these points, which are called regular. Moreover, this independence condi- 
tion allows, in the vicinity of each point q and by virtue of the Implicit Function Theorem [Dubrovin 
et al.(1992)Dubrovin, Fomenko, and Novikov,Weisstein(last accessed on 03/17/09)], to (formally) solve (1) 
for L of the coordinates, which we arbitrarily place at the end of q, splitting the original set as q = (u, d), 
with u := (u'')^i and d :— {d^)f^K+i- Then, in the vicinity of each point q satisfying (1), we can express 
the relations defining the constrained subspace, JC, parametrically by 

d' = f{u), I = K + 1,...,N. (3) 

where the functions f{u) := (/^(w))^;^^]^ ^'I'C the ones whose existence the Implicit Function Theorem 
guarantees. The coordinates u are thus termed unconstrained and they parameterize /C, whereas the 
coordinates d are called constrained and their value is determined at each point of K. according to (3). 
In general, the functions will depend on u, and the constraints will be said to be flexible [Echenique 
et al.( 20 11) Echenique, Cavasotto, and Garcia- Risuciio] . In the particular case in which all the functions 
are constant along /C, the constraints are called hard, and all the calculations are considerably simplified. 
In this work, we tackle the general, more involved, flexible case. 

Of course, even if IC is regular in all of its points, the particular coordinates d^ that can be solved need 
not be the same along the whole space. One of the simplest examples of this being the circle in , which 
is given by f{x, y) := x'^ — E? = 0, an implicit expression whose gradient is non-zero for all (a;, y) G K. 
However, if we try to solve, say, for y in the whole space /C, we will run into trouble at y = 0; if we try to 
solve for x, we will find it to be impossible at a: = 0. I.e., the Implicit Function Theorem does guarantee 
that we can solve for some of the original coordinates at each regular point of /C, but sometimes the 
solved coordinate has to be x and sometimes it has to be y. Nevertheless, we will assume this to be the 
case throughout this work, as is normally done in the literature [Echenique et al.(2006)Echenique, Calvo, 
and Alonso, Christen and van Gunstcren(2005),Hess et al.(2002)IIcss, Saint-Martin, and Bcrcndscn, Zhou 
et al.(2000)Zhou, Reich, and Brooks], and thus we will consider that K, is parameterized by the same 
subset of coordinates u for all of its points. 

It is also worth mentioning at this point that, not only from the physical point of view all the con- 
straints dealt with in this work are just holonomic constraints, but also the wording used to refer to 
the two flexible and hard sub-types is multiple in the literature. The first sub-type is called flexible in 
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refs. [Christen et al. (2007) Christen, Christ, and van Gunsteren, Christen and van Gunsteren(2005), Hess 
et al.( 2002) Hess, Saint-Martin, and Berendsen, Zhou et al. (2000) Zhou, Reich, and Brooks], elastic in [Cot- 
ter and Reich(2004)], and soft in [Zhou et al. (2000) Zhou, Reich, and Brooks]; whereas the second sub- type 
is called hard in refs. [Christen et al. (2007) Christen, Christ, and van Gunsteren, Zhou et al.(2000)Zhou, 
Reich, and Brooks], just constrained in [Christen and van Gunsteren(2005)], or holonomic in [Cotter and 
Reich(2004)], rigid in [Hess et al.(2002)Hess, Saint-Martin, and Berendsen, Zhou ct al. (2000) Zhou, Reich, 
and Brooks], and fully constrained in [Zhou et al. (2000) Zhou, Reich, and Brooks]. Some of these terms 
are clearly misleading [elastic^ holonomic or fully constrained)^ and, in any case, so many names for such 
simple concepts is detrimental to understanding in the field. 

The situation is further complicated by the fact that, when studying the statistical mechanics of 
constrained systems, one can think about two different models for calculating the equilibrium proba- 
bility density, whose names often collide with the ones used for defining the type of constraints ap- 
plied. On the one hand, one can implement the constraints by the use of very steep potentials around 
the constrained subspace; a model sometimes called flexible [Helfand(1979), Pechukas(1980)], some- 
times called stiff [Echcniquc et al.(2006)Echcnique, Calvo, and Alonso,Van Kampen and Lodder(1984)]. 
On the other hand, one can assume the D'Alembert principle [Goldstein ct al. (2002) Goldstein, Poole, 
and Safko] and hypothesize that the forces are just the ones needed for the system to never leave 
the constrained subspace during its dynamical evolution; a model normally called rigid [Echenique 
et al.(2006)Echenique, Calvo, and Alonso, Helfand(1979), Pechukas(1980)]. The two statistical mechan- 
ics models have long been recognized to present different equilibrium probability distributions [Go and 
Sclicraga(1976), Helfand(1979), Pechukas(1980), Van Kampen and Lodder(1984)], and this is the major 
concern in the literature when discussing them. In refs. [Echenique et al.( 20 11) Echenique, Cavasotto, and 
Garci'a-Risueho, Echenique ct al.(2006)Echenique, Calvo, and Alonso], the reader can find a very detailed 
discussion of this issue, which we only touch here briefly for completeness. 

It is worth remarking that the two types of constraints and the two types of statistical mechanics 
models can be independently combined; one can have either the stiff or the rigid model, with either 
flexible or hard constraints, hence making any interference between the two sets of words undesirable. The 
wording chosen is this work is, on the one hand, fairly common, and on the other hand, non-misleading. 

Now, if we take any physical observable X{q), depending only on the coordinates (not on the mo- 
menta), and originally defined on the whole space, W, its restriction to /C is given by 

Z{u):^X{uJ{u)) , (4) 

where the symbol has been deliberately changed in order to indicate that Z and X are different functions. 
The derivatives of this observable along K. are thus 

^{u) = ^(uj{u)) + ^(u,f{u))^iu) , (5) 

where we have assumed the convention that repeated indices (like / above) indicate a sum over the 
relevant range, and we have omitted (as we will often do) the range of variation of the index r. 

In the case of hard constraints, i.e., when the functions are all constant numbers dpi the above 
expression reduces to 

dZ , , dX , ^ ^ 

O^M-d^rM)^ (6) 

where X{q) must be a known function of q (in order to have a well-defined problem), and its derivative 
is typically easy to compute. However, if the constraints are of the more general, flexible form (the ones 
tackled in this work), the calculation of the partial derivatives {df^ /du^){u) cannot be avoided. 

If the constraints are assumed to be flexible, it is common in the literature of molecular modeling to de- 
fine these functions (u) as the values taken by the coordinates d if we minimize either the total or the po- 
tential energy with respect to all d^ at fixed u [Echenique et al.(2006)Echenique, Calvo, and Alonso, Chris- 
ten and van Gunsteren(2005),Hess et al.(2002)Hess, Saint-Martin, and Berendsen, Zhou et al.(2000)Zhou, 
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Reich, and Brooks, Echenique et al.(2011)Echcnique, Cavasotto, and Garcia- Risueno] . Since the energy 
functions used in molecular simulation are typically rather complicated, such as the ones in classical 
force fields, with a large number of distinct functional terms [Brooks ct al.(2009)Brooks, Brooks III, 
MacKerell, Nilsson, Pctrella, Roux, Won, Archontis, Bartcls, Boresch, Caflisch, Caves, Cui, Dinner, Feig, 
Fischer, Gao, Hodoscck, Im, Kuczera, Lazaridis, Ma, Ovchinnikov, Paci, Pastor, Post, Pu, Schaefer, 
Tidor, Venable, Woodcock, Wu, Yang, York, and Karplus, Jorgcnsen and Tirado-Rivcs(1988), Jorgensen 
et al. (1996) Jorgensen, Maxwell, and Tir ado-Rives, Ponder and Case(2003),Case et al.(2008)Case, Darden, 
Cheatham III, Simmerling, Wang, Duke, Luo, Crowley, Walker, Zhang, Merz, Wang, Hayik, Roitberg, 
Seabra, Kolossvary, Wong, Paesani, Vanicek, Wu, Brozell, Steinbrecher, Gohlke, Yang, Tan, Mongan, 
Hornak, Cui, Mathews, Seetin, Sagui, Babin, and Kollman, Pearlman et al.(1995)Pearlman, Case, Cald- 
well, Ross, Cheatham III, DeBolt, Ferguson, Seibel, and Kollman], or the effective nuclear potential 
arising from the solution of the electronic Schrodinger equation in the ground-state Born-Oppenheimer 
approximation [Echenique et al.(2006)Echenique, Calvo, and Alonso, Echenique and Alonso(2007)], the 
minimization of the energy with respect to the coordinates d has to be performed numerically. Hence, 
the functions (u), which are the output of this process, do not have a compact analytical expression 
that can be easily differentiated to include it in eq. (5) (this is even the case in very simple toy systems; 
see the Results and Discussion section). 

In this work, we present a parameter-free, exact algorithm (up to machine precision) to calculate 
the derivatives {df^ /du^){u) in such a case. Although several methods exist in the literature [Christen 
and van Gunsteren(2005), Hess ct al.(2002)Hess, Saint-Martin, and Berendsen, Zhou et al.(2000)Zhou, 
Reich, and Brooks] for performing MD simulations with flexible constraints, nobody has dealt, as far as 
we are aware, with the computation of these derivatives. Since the general idea can be applied to any 
situation in which (1) we have flexible constraints, (2) that are defined in terms of the minimization of 
some quantity with respect to the constrained coordinates, we first introduce, the essential part of the 
algorithm based on these two points. Then, we develop a more sophisticated application of this idea to 
the calculation of the derivatives along the constrained subspace of the Euclidean coordinates of molecular 
systems; a problem that we faced in our group when trying to calculate the correcting terms associated 
to mass-metric tensor determinants that appear in the equilibrium probability density when constraints 
are imposed [Echenique et al.(2006)Echenique, Calvo, and Alonso, Echenique and Calvo(2006), Echenique 
et al.(2011)Echenique, Cavasotto, and Garcia-Risueiio] . Finally, we perform a comparison between the 
results obtained with our exact algorithm and the calculation of the derivatives by finite differences; this 
serves the double purpose of numerically validating the algorithm and showing the limitations of the 
latter method, which needs the tuning of a parameter for each particular problem. 

Methods 
General Algorithm 

As we mentioned in the Introduction, we assume that we are dealing with a constrained problem in 
which the functions f^{u) in eq. (3) are defined as taking the values of the constrained coordinates 
that minimize a given function, V{q) = V{u, d), for each fixed u, i.e., 

V{uJiu))<Viu,d) ,yd£V{f{u)) , (7) 

where 'D(^f{u)^ is a suitable open set in containing the point f{u). Depending on the particular 
application, one can ask the minimum that defines the functions f^{u) to be global or just local. However, 
in the cases in which V is the total or the potential energy of a complex molecular system, it may become 
very difficult to find its global minimum (due to the shear number of dimensions of the search space), 
and the local choice is the only reasonable one [Echenique et al.(2011)Echcniquc, Cavasotto, and Garcia- 
Risuchol . 
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In order to calculate the derivatives along K. of any physical observable function of the coordinates 
Z(u) := X (u, f{u)'j , like the one defined in (4), we can always follow the finite-differences approach. 
However, as we discuss in the Results and Discussion section, finite differences presents intrinsic inac- 
curacies which are difficult to overcome, specially as the system grows larger. Let us now introduce a 
different way to calculate {dZ / du^){u) which does not suffer from this drawback. 

The starting point is eq. (5) in the Introduction, which we copy here for the comfort of the reader: 

^[u) = ^{uj{u)) + ^iu,f{u))^{u) . (8) 

As we mentioned, the expression of X{q), as well as the functions (u), must be known if we wish 
to have a well-defined constrained problem to begin with. Therefore, the only objects that remain to be 
computed are the partial derivatives {df^ /du^){u). 

If we assume that we have available some method to check that the order of the stationary point is 
the appropriate one (i.e., that it is a minimum, and not a maximum or a saddle point), we can write a 
set of equations which are equivalent to eq. (7), and which (implicitly) define the functions f^{u): 

dV 

— {uj{u))=0, I = K + 1,...,N. (9) 
Now, we can take the derivative of this expression with respect to a given unconstrained coordinate 

V ' V V ' 

Hri{u) Hji{u) F;'{u) 

where iJ^,y(M), with — 1, . . . ,N, is the Hessian matrix of V evaluated at (m, /(m)) G JC, and [u) 
is the matrix of unknowns that we want to solve for. In the whole document, we adhere to the practice 
of using different types of indices in order to indicate different ranges of variation. Here, for example, 
fj,,v, p, . . . run from 1 to N; r,s,t, . . . run from 1 to K] and /, J, M, . . . run from A' -|- 1 to TV. In the next 
section, we need to use more types of indices, but the idea is the same. 

It is worth mentioning that similar equations to the ones above can be found in classical mechanics 
anytime that local coordinates are used (the coordinates u in this work). For example, the force in such 
a case is defined as dV/du^ and the chain rule can be used in a similar way to what we do here. Note, 
however, that eq. (9) does not contain derivatives with respect to vT , but to the constrained coordinate 
. This makes the approach slightly different and, indeed, eq. (10) would become trivial in the most 
common hard situation tackled in the literature, where df^ jdu!' ~ 0, VJ, r. 

Since we are, by hypothesis, in a minimum of V with respect to the constrained coordinates d, the 
constrained sub-block Hji(u) of the Hessian is a positive definite matrix, and therefore invertible. Hence, 
if we multiply eq. (10) by its inverse, denoted by H^^'^ [u), sum over /, exploit the fact that iJ^j^(it) and 
H^^'^iu) are symmetric, and conveniently rename the indices, we arrive at: 

|^(u) =: Fl{u) = -H"{u)H.M , (11) 

which, as promised, allows us to find the exact derivatives {df^ /du^){u) with the only knowledge of the 
Hessian of V at the point [u,f{u)), and, upon introduction of the result in eq. (8), also the derivatives 
along the constrained subspace IC of any physical observable X{q). 

As mentioned, several methods exist in the literature [Christen and van Gunsteren(2005),Hcss et al.(2002)Hess, 
Saint-Martin, and Berendsen, Zhou et al.(2000)Zhou, Reich, and Brooks] to perform MD simulations 
with flexible constraints, however, none of them has tackled the calculation of these derivatives, which 
are very basic objects presumably to be needed in many future applications (see, e.g., refs. [Echenique 
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et al.(2006)Echemque, Calvo, and Alonso, Eclienique et al.(2011)Echenique, Cavasotto, and Garci'a- 
Risucno]). Of course, it is always possible to compute derivatives using the simple and straightforward 
method of finite differences. In this work, we use finite differences as a way of validating the new, exact 
method and ensuring it is error free. 

The accuracy of the new algorithm is only limited by the accuracy with which we can calculate the 
Hessian of V at (u, /(w)) and invert it; there is no tunable parameter that we need to adjust for optimal 
accuracy, as in the case of finite differences (see below and also Results and Discussion) . This makes a 
difference because, in classical force fields [Ponder and Case(2003)] and even in some quantum chemical 
methods (e.g., see chap. 10 of [Jcnscn(I998)]), the Hessian can be calculated analytically, without the 
need of finite differences. 

Although no optimization of the numerical cost has been pursued in this work, some remarks can be 
made about it, in comparison with the cost of the finite-differences approach. In order to calculate the 
partial derivatives dZ/du'^ with respect to the unconstrained coordinates u using finite differences, we 
need to: 

1. Minimize V{u^d) at fixed u to find f{u) (this step is common with the new method introduced 
here). 

2. Calculate Z{u) :— X (u, f{u)) (this step is common with the new method introduced here). 

3. Choose a displacement A and minimize V{u,d) at the point u := {u^)^^i, where ■u'' = u'' + A if 
r = s and = ii r s. This yields f{u) at a nearby point in K. with displaced a quantity 
A and the rest of unconstrained coordinates kept the same. 

4. Calculate Z{u) := X{u,f{u)). 

5. Calculate 

AZ _ Zjii) - Zju) 

Au^ A ^ ' 

as the finite-difference approximation to the sought derivative dZ/du^ at the point u. 

Note that the third point of this finite-differences approach is essentially a linear stability analysis. 
When strongly non-equilibrium points are present, such as in the examples discussed in the last section, 
this approximation fails and the fact that the new algorithm introduced in this work uses only quantities 
defined at the point u becomes even more important. 

Now, assuming that we have a good enough guess for the parameter A, the cost of this procedure 
is dominated by the need to perform K minimizations of the function V ^ one in each of the directions 
corresponding to the unconstrained coordinates . If we denote by A'jt the average number of iterations 
needed for these minimizations to converge, and we define Cy and Cdv as the numerical costs (in computer 
time) of computing V and its first derivatives with respect to the constrained coordinates d, respectively, 
we have that the average cost of calculating the sought derivatives dZ/ du^ using finite differences will be 
KNit{Cv + Cdv) for local optimization methods such as the steepest descent or the conjugate gradient, or 
KN[^Cv for Monte Carlo-based methods in which the derivatives of V are not needed, such as simulated 
annealing [Press et al.(2007)Press, Teukolsky, Vetterling, and Flannery]. 

On the other hand, the new algorithm does not require the extra minimizations, but it does require 
the calculation of the Hessian of V with respect to the internal coordinates (whose cost we call Ch), and 
the computation of the inverse of its constrained sub-block, Hjj, applied to each one of the K L- vectors 
Fr in eqs. (10) and (11), of cost CiH', resulting in a total cost of Ch + Kdn- 

The comparison between the two costs is not trivial and some remarks about it must be made: First, 
one must notice that the different individual costs involved, Cy, Cdv, Ch and CiH, are strongly dependent 
on the characteristics (1) of the coordinates q used and (2) of the function V{q). For example, if the 
coordinates q are the Euclidean ones and the function V{q) is the potential energy of a molecular system 
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as modeled by a typical force field [Brooks et al.(2009)Brooks, Brooks III, MacKerell, Nilsson, Petrella, 
Roux, Won, Archontis, Bartcls, Borcsch, Caflisch, Caves, Cui, Dinner, Fcig, Fischer, Gao, Hodoscck, Im, 
Kuczcra, Lazaridis, Ma, Ovchinnikov, Paci, Pastor, Post, Pu, Schaefcr, Tidor, Vcnablc, Woodcock, Wu, 
Yang, York, and Karplus, Jorgensen and Tirado-Rives(1988), Jorgenscn et al.(1996) Jorgensen, Maxwell, 
and Tirado-Rives, Ponder and Casc(2003), Case et al.(2008)Casc, Darden, Cheatham III, Simmerling, 
Wang, Duke, Luo, Crowley, Walker, Zhang, Mcrz, Wang, Hayik, Roitberg, Seabra, Kolossvary, Wong, 
Paesani, Vanicek, Wu, Brozell, Steinbrecher, Gohlke, Yang, Tan, Mongan, Hornak, Cui, Mathews, Seetin, 
Sagui, Babin, and Kollman, Pearlman et al.(1995)Pearlman, Case, Caldwell, Ross, Cheatham III, DeBolt, 
Ferguson, Seibel, and Kollman], the most direct algorithms for calculating V and its derivatives yield costs 
Cv, Cdv and Ch which arc of order N'^ , NK and A^^, respectively [Frenkel and Smit(2002)]. However, 
if more advanced long-range techniques are used, such as the particle-particle particle-mesh (PPPM) 
method [Eastwood and IIockncy(1974)], the fast multipole method [Greengard and Rokhlin(1987)] or 
the particle-mesh Ewald summation [Darden et al.(1993)Darden, York, and Pedersen], these costs can 
be reduced to order NlogN or even N (for large N and forgetting prefactors). Also, as mentioned, if 
the coordinates used are not the Euclidean ones but some internal coordinates such as the ones used 
in this work, these costs must change in order to account for the transformation between the two. If 
force fields are not used but, instead, V{q) is the ground-state Born-Oppenheimer energy as calculated 
using Hartree-Fock [Echenique and Alonso(2007)], then the most naive implementations yield costs for 
Cy, Cdv and Ch which are of order A^^ [Jensen(1998)]. The cost, Ci//, of calculating the inverse of Hji 
applied to a vector Fr can range from order A^ to order A^'^ depending on the sparsity of the matrix [Press 
et al.(2007)Press, Teukolsky, Vetterling, and Flannery], which, in turn, depends again on the coordinates 
used and on the structure of V{q). Finally, additional qualifications may complicate the comparison, such 
as the architecture of the computers in which the algorithms are implemented, parallelization issues, or 
the fact that, e.g., if we need the Hessian for a different purpose in our simulation, such as the calculation 
of the corresponding correcting term that appears both in the constrained stiff model and in the Fixman 
potential [Echenique et al.(2006)Echenique, Calvo, and Alonso], then the 'only' computational step we 
are adding is the inversion of a matrix. 

Despite the complexity and problem-dependence of the cost assessment, it must be stressed that, even 
in the cases in which the new algorithm turns out to be more expensive than the alternatives, the fact that 
it is exact and parameter-free might still make it the preferred choice in problems where high accuracy 
is needed. Although a parameter-free structure does not guarantee higher accuracy, in this case it does, 
since our method can be identified as the proper limit of the finite-differences scheme when A — > 0. This 
is illustrated in Results and Discussion. 

It is also worth remarking that the new method, as mentioned, is not needed to perform MD simula- 
tions, which can be run without calculating any of the derivatives tackled in this work [Christen and van 
Gunsteren(2005),Hess et al.(2002)Hess, Saint-Martin, and Bercndsen, Zhou et al. (2000) Zhou, Reich, and 
Brooks] . Our method is only needed when some observable in which these derivatives are included (such 
as the aforementioned mass-metric tensor determinants) needs to be computed. In such cases, the only 
two options to get to the final result are either finite differences or our method, and the most convenient 
of the two has to be chosen; even if its cost is a burden. 

Application to Euclidean Coordinates of Molecules 

In this section, we will apply the general algorithm introduced above to calculate the derivatives along the 
constrained subspace of the Euclidean coordinates of molecular systems in a frame of reference (FoR) fixed 
in the molecule. This problem has been faced by our group when trying to calculate the correcting terms 
associated with mass-metric tensor determinants that appear in the equilibrium probability density when 
flexible constraints are imposed [Echenique et al.(2006)Echenique, Calvo, and Alonso, Echenique and 
Calvo(2006), Echenique et al.(2011)Echenique, Cavasotto, and Garcia- Risucno] . More specifically, these 
derivatives are needed to calculate the determinant of the induced mass-metric tensor g that appears in 
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the constrained rigid model, according to the formulae derived in ref. [Echenique and Calvo(2006)]. 

In such a case, the system of interest is a set of n mass points termed atoms. The three Euclidean 
coordinates of atom a in a FoR fixed in the laboratory are denoted by Xq, and its mass by TOq., with 
a = l,...,n. However, when no explicit mention to the atom index needs to be made, we will use 
X := {x^)^^i to denote the iV-tuple of all the N := 3n Euclidean coordinates of the system. The 
masses TV-tuple, m := {'m^)^^i^ in such a case, is formed by consecutive groups of three identical masses, 
corresponding to each of the atoms. 

Apart from the Euclidean coordinates, one can also use a given set of curvilinear coordinates (also 
called sometimes general or generalized), denoted by q := (q'^)^^i, to describe the system. Both the 
coordinates x and q parameterize the whole space W, and the transformation between the two sets and 
its inverse are respectively given by 

= fi = l,...,N, (13a) 

q^'^Q^'{x), fi^l,...,N. (13b) 

We will additionally assume that, for the points of interest, this is a proper change of coordinates, 
i.e., that the Jacobian matrix 

has non-zero determinant. 

Now, we define a particular FoR fixed in the system to perform some of the calculations. To this 
end, we select three atoms (denoted by 1, 2 and 3) in such a way that o, the position in the FoR of 
the laboratory of the origin of the FoR fixed in the system, is the Euclidean position of atom 1 (i.e., 
o := xi). The orientation of the FoR {x' ,y' , z') fixed in the system is chosen such that atom 2 lies in the 
positive half of the 2; '-axis, and atom 3 is contained in the (a;', z')-plane, with projection on the positive 
half of the x'-axis (see fig. 1). The position of any given atom a in the new FoR fixed in the system is 
denoted by x'^. Also, let E{(f), 9, tp) be the Euler rotation matrix (in the ZYZ convention) that takes a free 
3-vector of primed components, a', to the FoR fixed in the laboratory, i.e., a — E{(l), 9, ip) a' [Goldstein 
et al. (2002) Goldstein, Poole, and Safko]. 

Although the aforementioned curvilinear coordinates q are a priori general, it is very common to 
take into account the fact that the typical potential energy functions of molecular systems in absence of 
external fields do not depend on := (oj,, Oy, o^) nor on the angles (0, 9, and to consequently choose 
a set of curvilinear coordinates split into q = (e,r), where the first six are these external coordinates, 
e := (6"^)^^;^ = {ox,Oy,Oz, 0, 9, tp). As we mentioned before, o := {ox,Oy, Oz) describes the overall position 
of the system with respect to the FoR fixed in the laboratory, and its overall orientation is specified by 
the angles {(f>,9,ifj). The remaining N — 6 coordinates r := {r"')a=7 are called internal coordinates and 
determine the positions of the atoms in the FoR fixed in the system [Echenique(2007), Echenique and 
Alonso(2006)]. They parameterize what we shall call the internal subspace or conformational space, 
denoted by I, and the coordinates e parameterize the external subspace, denoted by £; consequently 
splitting the whole space a,s W — £ x I (denoting by x the Cartesian product of sets) . 

The position, x'^, of any given atom a in the axes fixed in the system is a function, X^{r), of only 
the internal coordinates, r, and the transformation from the Euclidean coordinates x to the curvilinear 
coordinates q in (13a) may be written more explicitly as follows: 

x;a^^Xjq) = o + E{cP,9,^)X;,^{r) . (15) 

Although general constraints affecting all the coordinates q [like those in (1)] can be imposed on the 
system, the already mentioned property of invariance of the potential energy function under changes of 
the external coordinates, e, together with the fact that the potential energy can be regarded as 'producing' 
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the constraints [Eclieniquc ct al.(2006)Echenique, Calvo, and Alonso], make physically frequent the use 
of constraints involving only the internal coordinates, r: 

h\r)^0, I = K +1,...,N . (16) 

Under the common assumptions in the Introduction, these constraints allow us to split the internal 
coordinates as r = (s, d), where the first M:=K — 6 = N — L — 6 ones, s :— (s*)fl6+ii ^'^^ called 
unconstrained internal coordinates and parameterize the internal constrained subspace, denoted by E. 
The last L := N — K ones, d :— (rf^)/Li<-+ii correspond to the constrained coordinates in the Introduction 
and are called. The external coordinates, e, together with the unconstrained internal coordinates, s, 
constitute the set of all unconstrained coordinates of the system, u = (e, s), which parameterize the 
constrained subspace /C, being /C = £ x S. 

In this situation, the constraints in eq. (16) are equivalent to 

d'^fis), I = K + l,...,N, (17) 

and the functions f^(s) are defined as taking the values of the coordinates d^ that minimize the potential 
energy with respect to all d^ at fixed s [Echenique et al.(2006)Echenique, Calvo, and Alonso, Echenique 
et al.(2011)Echenique, Cavasotto, and Garcia- Risueno, Zhou et al.(2000)Zhou, Reich, and Brooks]. 

Finally, if these constraints are used, together with (15), the Euclidean position of any atom in the 
constrained case may be parameterized with the set of all unconstrained coordinates, u, as follows: 

Xa = Za{u) := Xa{e,s, f{s)) 
- o + i?(0,0,,A)^^(s,/(s)) 

=: o + i?(0,0,V)C(s) , (18) 

where the name of the transformation functions has been changed from X to Z, and from X' to Z', in 
order to emphasize that the dependence on the coordinates is different between the two cases. 

In order to calculate the derivatives along E of the primed atoms positions, Z^, with respect to the 
unconstrained internal coordinates s (needed, for example, in eq. (28) of ref. [Echenique and Calvo(2006)] 
to compute the determinant of the induced mass-metric tensor g), we first differentiate with respect to 
in Z^{s) := X^(s, f{s)) , arriving to the analogue to eq. (8): 

§(»./(.»+ if (../w)|;m^ 

Now, the derivatives (9/^/9s')(s) can be calculated using the general algorithm introduced in the 
previous section simply noticing that, in this case, V is precisely the potential energy of the system. 
Therefore, the only objects that remain to be computed are the derivatives dX^/ds^ and dX^/dd^ , 
which can be known analytically (they are geometrical [or kinematical] objects, i.e., they do not depend 
of the potential energy). We now turn to the derivation of an explicit algorithm for finding them and 
thus completing the calculation that is the objective of this section. 

In the supplementary material of ref. [Echenique and Calvo(2006)], we give a detailed and explicit 
way for expressing any 'primed' vector X^ as a function of all the internal coordinates, in the particular 
coordination scheme known as SASMIC [Echenique and Alonso(2006)]. We could take the final expression 
there [eq. (5)] and explicitly perform the partial derivatives, however, we shall follow a different approach 
that is both more straightforward and applicable to a larger family of Z-matrix-like schemes for defining 
internal coordinates. 

In non-redundant internal coordinates schemes, whether they are defined as in ref. [Echenique and 
Alonso(2006)] or not, each atom is commonly regarded as being incrementally added to the growing 
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molecule for its coordination. This means that the position of the {a > 3)-th atom in the body-fixed 
axes is uniquely specified by the values of three internal coordinates that are defined with respect to the 
positions of three other atoms with indices /3(a), S{a), 7(a) < a. This is a very convenient practice, and 
we will assume that we are dealing with a scheme that adheres to it. 

Normally, the first of the three internal coordinates used to position atom a is the length of the vector 
joining a and (3{a). Atom /3(a) is commonly chosen to be covalently attached to a and, then, the length 
of this vector is naturally termed bond length, and denoted by A given function /3(a) embodies the 
protocol used for defining this atom, /3(ck), to which each 'new' atom a is (mathematically) attached; a 
superindex, as in (5P(a), indicates composition of functions, and the iteration of such compositions allows 
us to trace a single-branched chain of atoms that takes from atom a to atom 1, at the origin of the 
'primed' axes. If Na is a number such that l3^°'{a) ~ 1, this chain is given by the following set: 

{a = /30(a), /3(a), /32(a), . . . , /3^"-3(a), 3, 2, 1} . (20) 

It is clear that, if we now change a given bond length 5^ associated to atom e, atom a will move if 
e € Ba] simply because atom e will move and a has been positioned in reference to atom e's position. 
Thus, if we define 

Rafi ■= Xp - , (21) 

for any a, /3, and accordingly denote by Rp(e)s the unitary vector in the 'primed' FoR that points from 
atom /3(e) to atom e, a change in the bond length associated to e from b^ to fe^ + db (while keeping the 
rest of the internal coordinates constant) will translate all atoms a such that e € Ba a distance db along 
Rl}{E)e, having 

X^(6e + db) = X^{b,) + , (22) 

and hence 

K{be+dh)-X^{b,) ^ 

The second internal coordinate, after 6^, that is typically defined to position atom a with respect to 
the 'already positioned' part of the molecule is a so-called bond angle 9a. To define this angle, we need an 
additional atom associated with a, which we could denote by (5(a). Although one can in principle think 
of the possibility of using different atoms f3g{a) and Sg{a) to define the bond angle than the one used to 
define the bond length, the common practice in the literature is to use the same three atoms /3(a), d{a) 
and 7(a), to define the three internal coordinates associated to a. This is also the choice in the SASMIC 
scheme and the one in this work. The angle 6a is thus defined as 180° minus the angle formed between 
the vectors Rs{a)0{a) and Ri3(a)a (see fig. 2). 

Now, the reasoning is the same as in the case of the derivative with respect to b^: For every atom e > 2 
that is the 'tip' of the bond angle 6^, the changes in this angle (keeping the rest of internal coordinates 
constant) will move atom e and therefore all atoms a that contain atom e in the chain Ba that links 
them to atom 1. 

If we now look at fig. 2, we see that a change from 9^ to 0^ -I- d9 amounts to rotate all atoms a that 
contain e in their chain to the origin an angle d9 around the unitary vector 9^, which is defined by 

l^/3(£)e X Rs{e)l3(e)\ 

The result, v-cot, of rotating a vector v around the direction given by the unitary vector 9 an amount 6 
is given by the well-known Rodrigues' rotation formula [Belongie(last accessed on 08/12/2009), Goldstein 
et al. (2002) Goldstein, Poole, and Safko]: 

v,^^ = vcos9+ {9 X v) sm9 + 9{9 ■ v){l - cos 9) . (25) 
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However, notice that, in order to define a rotation, it is not enough to specify the angle 9 and the 
rotation axis 9, but one additionally needs to specify a fixed point (which can actually be any of the 
points in a fixed line in the direction of 9). Therefore, the above expression is only correct for either 'free' 
vectors v (i.e., those that are not associated to a given point in space), or for vectors v whose starting 
point lies in the aforementioned fixed line. 

The fixed point for the rotation we are interested in can be chosen to be /3(e) and, using eq. (25), we 
have that 

Rl3(e)a{0e + d9) = Rp(^^)a{Oe) CO&d9 

+ [4xi?^(,)„(^?e)]sinrf0 (26) 

+ ee[Oe-Rmo.{^e)\{i~cosde) . 

Then, keeping the terms up to first order in dO, we can easily compute the derivative: 

9Rp{e)a _n ^ n (07) 

QQ^ - X Kp^^)^ , (27) 
which, since a variation of 9^ does not move atom /3(e) (i.e. dX '_ji^^^/d9^ = 0), allows us to conclude that 

^ = J; {^^(•^^ + ^^(^)") ^ = ' ^^^^ 

if e G Ba- 

The third and last internal coordinate that is usually defined to position atom a is a so-called dihedral 
angle tpa. To define this angle, we need a third atom associated with a, which we could denote by 7(a). 
The angle (pa is thus defined as the oriented angle formed between the plane containing atoms /3(a), 
(5(a) and 7(a) and the plane containing atoms a, /3(a) and 5(a). The positive sense of ipa is the one 
indicated in fig. 3, and, although it is common to find two different covalent arrangements of the four 
atoms a, /3(a), (5(a) and 7(a), termed principal and phase dihedral angles, respectively [Echenique and 
Alonso(2006)], this does not affect the mathematical definition of (pa given in this paragraph, nor the 
subsequent calculations. 

Regarding the derivative of the 'primed' position of atom a with respect to a given the only 
difference with the bond angle case is that, now, the rotation is performed around the direction given 
by the unitary vector Rs(e)i3(e) (see fig. 3). The fixed point can be again chosen as /3(e), and eq. (27) 
(changing 9^ by ipe and 9^ by Rs{e)i3{s))7 &s well as the fact that changes in ip^ do not move atom /3(e), 
still hold. Therefore, 

dX ' - -> 

-g^ = Rs(e)P{e) X i?/3(e)a , (29) 

if e e Ba- 
in order to decide whether or not atom a will move upon changes in internal coordinates associated 
to atoms e that do not belong to Ba we must first finish the story about internal coordinates definition. 
Since the argument above to show that a moved when e G Ba was that e itself moved and it was used 
to position a, we must ask 

1. whether or not there can be atoms that are also used to position a but that do not belong to Ba, 
and 



2. what happens when we change the internal coordinates associated to them. 
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The answers to these two questions depend on the particular scheme used to define the internal 
coordinates, and we will tackle them referring to the SASMIC scheme [Echcniquc and Alonso(2006)], 
which is the one used in this work: According to the SASMIC rules, there are only two situations in 
which an atom e ^ Ba can be used to position atom a, and they are depicted in fig. 4. 

The first case, in fig. 4a, attains only the first atoms of the molecule. Typically, atom 1 is not a first- 
row atom, but a hydrogen (such is the case of the three molecules studied, for example, in Results and 
Discussion). Hence, after positioning atoms 2 and 3, which are typically first-row, it is more representative 
to choose atom 3 as S{a) and atom 1 as 7(a) when positioning the rest of the atoms a attached to atom 
2. This makes 1 = f3'^{c() 7^ S{a) and hence 6{a) qualifies as an atom that is used to position a but which 
is not included in the chain Ba- 

The second case, in fig. 4b, corresponds to the situation in which the molecule divides in two branches, 
and it can happen all along its chemical structure. If atom e is the atom that defines the only principal 
dihedral over the bond connecting (5(e) and /3(e) (in the SASMIC scheme, only one principal dihedral 
can be defined on a given bond [Echcniquc and Alonso(2006)]), and atom a belongs to a different branch 
than the one beginning in e (the branches are indicated with grey broad arrows), then the starting atom 
e' of the branch to which a belongs (e' can be a itself) must be positioned using a phase dihedral in 
which e = 7(e'). Thus, s is an atom that is used to position a, but which does not belong to the chain 
Ba connecting a to atom 1. 

In principle, any change in the internal coordinates of atom 6{a), in the first case, or in those of atom 
e, in the second case, may move atom a, however, due to the geometrical characteristics of the internal 
coordinates, this is not the case. 

For example, it is easy to see that, in the case depicted in fig. 4a, a variation of the bond length be 
(denoting e := S{a)) does not move atom a. Regarding the angles, the dihedral ip^ is not defined because 
e = 3, and a change in 9^ can be seen to rotate atom a with fixed point /3(e) = f3{(x) and around the axis 
given exactly by 9^ as defined in eq. (24). (It is not trivial to see that this motion keeps all the rest of 
internal coordinates constant, specially the phase dihedral ip^- The authors found it helpful to imagine 
that atoms 1, 2 and 3 lie in the plane of the paper, with 9^ and a coming out of it towards the reader; 
the first orthogonally and the second not.) Therefore, the derivative of the Euclidean position of atom a 
with respect to 9^ is also given by eq. (28) in this special case. 

In the situation shown in fig. 4b, one can see that neither a change in nor in 9^ move atom e' nor a. 
However, if we change ip^, we need to move atom e' if we want to keep ip^' constant. Therefore, atom a 
moves in such a case and it does so by rotating with the same fixed point /3(e) and the same axis Rs(e)i3(e) 
as in the simpler cases depicted in fig. 3. This means that, again, we can calculate the sought derivative 
using the already justified eq. (29). 

In summary, only changes in bond lengths associated to atoms e E Ba affect the position of atom a: 



changes both in bond angles associated to atoms e E Ba and to d{a) in fig. 4a affect the position of atom 
a: 



and changes both in dihedral angles associated to atoms e £ Ba and to those that define the principal 
dihedral at a branching point that leads to atom a (see fig. 4b) can affect the position of atom a: 




(30) 




(31) 



Rsie)P{e) X Rp(e)a if e € 



dpe 



or [^pe ppaL, /3(e) e Ba] 



(32) 







otherwise 
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Finally, the outline of the algorithm for calculating the sought derivatives 9X^(s, /(s)) /ds^ along the 
constrained subspace E is: 

1. Calculate the chain Ba that connects atom a with atom 1 and identify the special cases depicted 
in fig. 4. 

2. Calculate the derivatives df^ /ds^ by solving the system of linear equations in (11). 

3. Calculate the geometric derivatives dX^{r)/ds'^ and dX^{r)/dd^, for / = K + 1, . . . , N, using 
eqs. (30), (31) and (32). 

4. Plug all the calculated quantities into eq. (19) et voila. 

Results and Discussion 

In this section, we compare the finite-differences approach (see Methods) to the new algorithm introduced 
in this work with two objectives in mind: the validation of the new scheme, and the identification of the 
most important pitfalls of the finite-differences technique, which are absent in the new method. It is 
worth stressing again that the method presented here is the first of its kind, as far as we are aware, and 
the finite-differences scheme is just a very natural and straightforward method that is always available 
when derivatives need to be calculated. In fact, the pitfalls of finite differences which we highlight in 
this section are very well known, although they have been seldomly presented in the context of molecular 
force fields. We hope that this section can be additionally useful to revisit this classical topic from a new 
angle. 

To these two ends, we have applied the more specific algorithm introduced in the previous section for 
the calculation of the derivatives of the Euclidean coordinates of molecular systems to the three biological 
species in fig. 5: methanol, N-methyl-acetamide (abbreviated NMA), and the tripeptide N-acetyl-glycyl- 
glycyl-glycyl- amide (abbreviated GLY3). For each one of these molecules, a number of dihedral angles 
describing rotations around single bonds (and indicated with light-blue arrows in fig. 5) have been chosen 
as the unconstrained internal coordinates, s, spanning the corresponding constrained internal subspace 
S. The rest of internal coordinates d (bond lengths, bond angles, phase dihedrals, and principal dihedrals 
over non-single bonds) are flexibly constrained as described in the previous sections. The numeration of 
the atoms and the definition of the internal coordinates follow the SASMIC scheme, which is specially 
adapted to deal with constrained molecular systems [Echenique and Alonso(2006)]. 

For methanol and NMA, due to the small dimensionality of their constrained subspaces, the working 
sets of conformations have been generated by systematically scanning their unconstrained internal coor- 
dinates at finite steps. For methanol, we produced 19 conformations, in which the central dihedral, ipe, 
ranges from 0° to 180° in steps of 10°. Similarly, the systematic scanning of the unconstrained dihedrals 
in NMA produced a set of 588 conformations in which the first and last angles, (pQ and ipio, range from 
0° to 180°, and the central one, (ps, ranges from 0° to 330°, all in steps of 30°. For GLY3, and in view 
of the dimensionality of its constrained subspace, 1368 conformations were generated through a Monte 
Carlo with minimization procedure. 

At each one of these conformations, defined by the value of the unconstrained internal coordinates 
s, the constrained coordinates d were found by minimizing the potential energy V{s,d) at fixed s, thus 
enforcing the constraints d = f{s) described in Methods. Let us remark that this fixing of the coordinates 
s is just an algorithmic way of sampling the constrained subspace defined by the relations d = /(s), and 
it does not imply that the coordinates s are constrained; indeed, they could take any value in the set 
of conformations, whereas the constrained coordinates d are fixed by the aforementioned relations. The 
potential energy and force-field parameters were taken from the AMBER 96 parameterization [Cornell 
et al.(1995)Cornell, Cieplak, Bayly, Gould, Merz, Ferguson, Spellmeyer, Fox, Caldwell, and Kollman, 
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Kollman et al.(1997)Kollman, Dixon, Cornell, Fox, Chipot, and Pohorill], and local energy minimization 
with respect to the constrained coordinates was performed with Gaussian 03 [Frisch et al.(2007)Frisch, 
Trucks, Schlcgel, Scuseria, Robb, Chccseman, Montgomery, Vreven, Kudin, Burant, Millam, Iyengar, 
Tomasi, Barone, Mennucci, Cossi, Scalmani, Rega, Petersson, Nakatsuji, Hada, Ehara, Toyota, Fukuda, 
Hascgawa, Ishida, Nakajima, Honda, Kitao, Nakai, Klene, Li, Knox, Hratchian, Cross, Bakken, Adamo, 
Jaramillo, Gomperts, Stratmann, Yazyev, Austin, Cammi, Pomelli, Ochterski, Ayala, Morokuma, Voth, 
Salvador, Dannenberg, Zakrzewski, Dapprich, Daniels, Strain, Farkas, Malick, Rabuck, Raghavachari, 
Foresman, Ortiz, Cui, Baboul, Clifford, Cioslowski, Stefanov, Liu, Liashenko, Piskorz, Komaromi, Martin, 
Fox, Keith, Al-Laham, Peng, Nanayakkara, Challacombe, Gill, Johnson, Chen, Wong, Gonzalez, and 
Pople]. At the minimized points, the Euclidean coordinates, Z^{s) :— X^(^s, f{s)^ , of all atoms in the 
system-fixed axes defined in the Methods section were also computed. 

In order to find the partial derivatives dZ'^jds^ at the generated points by finite differences, we 
produced, for each conformation in the working sets, M — K ~ 6 additional conformations, each 
one with a single coordinate displaced to s* + A. After the re-minimization of the constrained 
coordinates at the new points, we were in possession of all the data needed to compute the esti- 
mate of the sought derivative in eq. (12) for all unconstrained coordinates. In order to assess the be- 
haviour and accuracy of the finite-differences approach, we performed these calculations for the values 
A = 0.01°, 0.05°, 0.1°, 0.5°, 1.0°, 5.0°, 10.0°. 

On the other hand, to calculate the derivatives dZ^/ds^ using the new scheme introduced in Methods, 
we do not need to perform any additional minimization, but we need to know the Hessian matrix of the 
second derivatives of V{s,d) with respect to the internal coordinates. The Hessian in internal coordi- 
nates was calculated with the Gaussian 03 package [Frisch et al.(2007)Frisch, Trucks, Schlegel, Scuseria, 
Robb, Cheeseman, Montgomery, Vreven, Kudin, Burant, Millam, Iyengar, Tomasi, Barone, Mennucci, 
Cossi, Scalmani, Rega, Petersson, Nakatsuji, Hada, Ehara, Toyota, Fukuda, Hascgawa, Ishida, Naka- 
jima, Honda, Kitao, Nakai, Klene, Li, Knox, Hratchian, Cross, Bakken, Adamo, Jaramillo, Gomperts, 
Stratmann, Yazyev, Austin, Cammi, Pomelli, Ochterski, Ayala, Morokuma, Voth, Salvador, Dannen- 
berg, Zakrzewski, Dapprich, Daniels, Strain, Farkas, Malick, Rabuck, Raghavachari, Foresman, Ortiz, 
Cui, Baboul, Clifford, Cioslowski, Stefanov, Liu, Liashenko, Piskorz, Komaromi, Martin, Fox, Keith, 
Al-Laham, Peng, Nanayakkara, Challacombe, Gill, Johnson, Chen, Wong, Gonzalez, and Pople]. 

In order to compare the two methods, we turn first to the smallest system: methanol. In fig. 6a, we 
can see the value of the derivative dx^/d^p(, of the x-coordinate (in the 'primed' axes, but we drop the 
prime from now on) of hydrogen number 5 (see fig. 5) with respect to the unconstrained dihedral angle 
ipQ that describes the rotation of the alcohol group with respect to the methyl one. We can see that the 
agreement between the new algorithm and the finite-differences approach is good but not perfect, and 
that the discrepancy between the two is larger for the smallest (0.01°) and largest (10.0°) values of A 
depicted in the graph. 

To track the source of this difference, we can take a look at eq. (8), which gives the derivative 
dZ^/ds^ as a function of simple, 'geometrical' terms, dX^/ds^ and dX^/dd^ , and the numerical deriva- 
tives df^ /ds\ Of course, the choice of one method or another does not affect the former, but only the 
latter. In the particular case of dx^/dipQ in fig. 6a, if we remove the terms that are zero according to the 
rules in eqs. (30), (31) and (32), eq. (8) becomes 



The numerical derivatives appearing in this expression that are related to the three constrained 
coordinates associated to atom 5 are shown in figs. 6b, 6c and 6d, respectively, where we can see that 
the discrepancy between the new algorithm and the finite-differences approach is more significant. For 



dx5 db2 dx5 dbs 8x5 863 
db2 dipQ dbs d(pe 803 dipa 
8x5 db^ 8x5 96*5 dip5 



(33) 
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the bond angle 65 in fig. 6b, we see that the derivative predicted by finite differences is close to zero 
for all values of (ySg and for all the tested As, while the behaviour given by the new algorithm is more 
rich and substantially different. This large discrepancy is produced by the fact that bond lengths are 
very stiff coordinates in the energy function that we have used here, together with the default precision 
of the floating point numbers provided by Gaussian 03 outputs. In table 1, we can see indeed that the 
last significant figure of bond length 65 only starts to change for A = 5.0°, which makes any algorithm 
based on finite differences very unreliable for this particular quantity if small values of A are used. The 
bond angles and dihedral angles, on the other hand, are somewhat less stiff than bond lengths, as it 
can also be seen in tab. 1. This makes their derivatives by finite differences more reliable, as one can 
observe in figs. 6c and 6d, where the discrepancy with the new method is apparent for small A, but 
becomes gradually smaller as we increase it. Of course, since, in the new method presented in this work, 
all quantities are computed at the non-displaced point A = 0°, the problem regarding the number of 
significant figures does not appear. It is also worth remarking that, in the case of finite differences, the 
point in which this issue will appear depends on the number of bits used to represent coordinates, but it 
will always appear for some small enough value of A. 

As we noticed in fig. 6a, also in the case of the constrained internal coordinates the difference between 
the two methods starts to grow again when A reaches 5.0° or 10.0°. This is easily understood if we 
think that only in the A — > limit the estimate in eq. (12) converges to the actual value of the partial 
derivative. In fact, as the complexity of the system increases, the error introduced at large A may come 
not only from continuous changes in the location of the constrained minima, but also, as fig. 7 suggests, 
it may occur that, at a certain value of A, the very identity of the minima is altered, thus introducing 
potentially larger errors. In fig. 7a, we can see that the derivative d(p22/d(pn in GLY3 presents an 
unusually large error at the conformation 1044. In fig. 7b, we see that the minimum-energy value of (P22, 
which is the dihedral angle associated to carbon 22, describing the rotation around a given peptide bond 
(see fig. 5c), presents an abrupt change when A reaches 10°. If we think that the energy landscape of 
GLY3 is indeed a complex and multidimensional one, it is not difficult to imagine that, as we change 
(piY, i.e., as we increase A, the energy landscape is so altered that some minima disappear, some other 
appear, and the energy ordering among them is changed. In such a case, the structures found by the 
minimization procedure will be rather different between, say, A = 0° and A = 10°, thus producing a 
large error in the derivatives calculated by finite differences. Again, the new algorithm, which only uses 
quantities calculated at A = 0°, does not suffer from this drawback. 

To sum up, the finite-differences method contains two sources of error which the new method does not 
present: one at small values of A, related to the finite precision of the floating point numbers representing 
the internal coordinates, and the other at larger values of A, stemming from the very definition of the 
partial derivative by finite differences, and aggravated by the complexity of the energy landscapes of large 
systems. If the derivatives are to be calculated using finite differences, an optimal value of A must be 
chosen in each case so that the possible error is minimized. However, already in the simple example of 
methanol, we saw that the derivatives of different observables, in the same system, may behave differently 
as we change A (compare the bond length derivative in fig. 6b with that of the angles in figs. 6c and 6d) . 
In fig. 8, we additionally see that the search for the optimal A may be further complicated by the fact that 
the behaviour found also depends (strongly) on the system studied, and, in the case of the derivatives of 
the Euclidean coordinates, on the position of the atom in the molecule. 

In fig. 8a, we have plotted the normalized average of the absolute value of the error in the derivatives 
of the Euclidean coordinates, (le^l), as a function of A for the three molecular systems studied. This 
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quantity is defined, for a given unconstrained coordinate s\ as 



|ez|)(A) := (34) 



A^c(3n- 6) 

m—1 /A— 1 * 



where tlie index m indicates tlie conformation in the working set, running from 1 to A'c, FD stands for 
'finite differences', NA for 'new algorithm', and (5f is a normahzing quantity for each coordinate chosen 

The graphics in fig. 8a of this quantity correspond to the unconstrained dihedral angles (pQ, (ps and ipn 
of methanol, NMA and GLY3, respectively (see fig. 5). We observe that the average error as a function 
of A presents significantly different behaviours in the three molecules, never being smaller than a 2%. 
Additionally, in fig. 8b, we show the same error but this time individualized to the z-coordinate of three 
different l^*-row atoms of NMA: C3, N6 and C8. Although the overall behaviour of the error is similar 
for the three atoms, its size is not. 

All in all, we see that the need to tune for the optimal A in the finite-differences approach not only 
produces unavoidable errors, but also it must be done in a per-system, per-observable basis, clearly 
complicating and limiting the use of this technique. The new algorithm, on the other hand, is only 
affected by the source of error related to the accuracy with which the Hessian matrix of the potential 
energy can be calculated and inverted; apart from this, which is a general drawback of any method 
implemented in a computer, its mathematical definition is 'exact', in the sense that it does not contain 
any tunable parameter, like A, that must be adjusted for optimal accuracy in each particular problem. 

Also, and more importantly (since the failure of finite differences was indeed predictable) the good 
coincidence between the newly introduced, somewhat more involved method and the straightforward 
finite-differences scheme for the smallest system and in some intermediate range of values of A allows us 
to regard the new scheme as validated and error-free. 

Finally, despite what we discussed in the Methods section, namely, that we have not pursued here 
the numerical optimization of the algorithm introduced, being our main interest to present the general 
theoretical concepts and to show that the new method is exact and reliable, we close this section with an 
example of a toy system to provide a clue that the new technique is at least feasible. Before introducing 
the toy system it is worth noting that the examples tackled in this section are just particular cases, 
but the technique can be used in different systems and with different potential energy functions. When 
looking at the computer costs presented below, the reader should bear in mind that they may be not very 
significant (due to the aforementioned lack of optimization) and not very relevant (due to the choice of 
a small toy system and a given potential energy function). Of course, if any production runs using the 
new algorithm are attempted, a thorough numerical optimization and assessment should be performed, 
which we deem to be a very important next step of our work. 

The toy system is a 2-dimensional one, with positions x and y, and the following potential energy (see 
fig. 9): 

V{x,y) ^ -K{2 + smx){y~smx)'^ +tanh{xy) . (36) 

If we take a large enough K, say K = 20, we see that the system will present a strong oscillatory 
motion in the y coordinate, around approximately y = sinx (but not exactly, since the term tanh{xy) 
slightly modifies the position of the minimum), and with harmonic constant approximately equal to 
K{2 + sinx). In the spirit of this work, since, due to energetic reasons, the value of y will scldomly move 
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far away from the value that minimizes V{x, y) for each x, denoted by f{x) and imphcitly defined by the 
following equation: 

dV X 

— (x, fix)) = K{2 + sin x) {fix) - sin x) + = , (37) 

cosh {xf{x)) 

we can kill this oscillatory motion by assuming that a flexible constraint y — fix) exists. In such a case, 
X plays the role of the whole set of unconstrained coordinates s in the general formalism, and y plays the 
role of the whole set of constrained coordinates d. 

Now, if we perform a 'molecular dynamics' of this system, then we may need at some point to 
compute the derivative with respect to x of some observable Xix, y) restricted to the constrained subspace 
Zix) := /(x)) (for example, we may need this to calculate mass-metric tensor corrections at each 
time step [Echenique ct al.(2006)Echcniquc, Calvo, and Alonso]). We can do so by using finite differences 
or the new technique introduced in this work. As we discussed in the Methods section, for both approaches 
we will need to perform a minimization of Vix, y) at each fixed x in order to find fix) and Zix) := 
X(^x, fix)), hence, being this step common, we will not consider it for the assessment of the differences 
in computational cost between the two methods. The additional computations that will decide which 
method is faster are: 

• For finite differences: Choose a displaced value of the unconstrained coordinate x + Ax, minimize 
Vix + Ax, y) with respect to y in order to find fix + Ax), as well as X(^x + Ax, fix + Ax)) , and 
finally calculate the finite-differences estimation of the sought derivative: 

dZ X{x + Ax,fix + Ax))-X{x,fix)) 

dx^''''^ Ax ■ ^^^^ 

• For the new method: Calculate the objects in eq. (11), perform the required inversion to find 
df/dx, calculate the objects in eq. (5), and finally find dZ/dx using this last expression. In this 
simple case, all the objects to be computed are: 

The second-order derivatives of the potential energy can be easily calculated: 

d'^V , , . , 2a;^ tanh(x-(/) , 

ix,y)^Ki2 + sinx) ^ , 40a 

oy' cosh (a;?;) 

d^V / . i\ 1 — 2a;vtanh(a;w) x 

=Xcosa; 2;-2 1 + sina; + [ ^ ^' , 40b 

oxdy cosh~(xy) 

and we can use them to find the derivative of fix) through eq. (11): 



df, . 



dxdy 

The particularization of eq. (5) to this simple case is 



[X 



/(^)) 



(41) 



-(.)^-(x,/(x)) + -(x,/(.))^(x). (42) 
In this section, for illustrative purposes, we have chosen a simple observable Xix, y): 



Xix,y) := \/a;2 H- y2 ^ 



(43) 
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i.e., the distance of the particle to the origin of coordinates. Hence, the remaining objects that we need 
to compute in order to apply the new technique to this problem are 

5X, 



^ -(x,y) = , , (44a) 

??(^:2/) = ^=^- (44b) 

We have calculated dZ/dx using both techniques for 11 different values of a; = —5.0, —4.0, . . . , 4.0, 5.0. 
This calculation has been performed in a desktop iMac with a 2.66 GHz Intel Core 2 Duo processor and 
4GB of 1067 MHz DD3 RAM memory, running MacOSX Snow Leopard. The same compilation-time 
optimizations have been used in the two cases, and the common times have been subtracted as indicated 
before. It is also worth remarking that we have used Brent's method [Press et al.(2007)Prcss, Teukolsky, 
Vetterling, and Flanncry] for minimizing V{x,y)^ and a different choice will change the comparison. In 
these conditions, the new technique has proved approximately one order of magnitude faster than finite 
differences, elapsing 0.13 /zs/point vs. 1.08 /xs/point. 

In summary, in this work, we have introduced a new, exact, parameter- free method for computing 
the derivatives of physical observables in systems with flexible constraints. The new algorithm has been 
numerically validated in small molecules against its most natural alternative, finite differences. In doing 
so, numerous pitfalls of the latter method have been demonstrated, all arising from the fact that it 
contains a tunable parameter that has to be optimally adjusted in each particular problem at hand. 
In a number of numerical experiments, we have shown that the finite-differences approach contains two 
unavoidable sources of error that are not present in the new method: On the one hand, the finite number 
of significant figures used to represent, in computers, the values of the optimized coordinates, together 
with the fact that these constrained coordinates are typically very stiff, make the changes in this quantities 
often unobservable or at least badly resolved, thus rendering the finite-differences derivatives unreliable 
for small values of the displacement parameter A. On the other hand, the very fact that finite-differences 
derivatives only converge to the true ones for A — >■ 0, complicated with the possibility that the energy 
landscapes of complex molecular systems may significantly change their structure when the unconstrained 
coordinates are displaced, introduce new errors as A increases. These two sources of errors combined make 
compulsory the search of an optimal value of A in each particular case, and also establish a minimum 
error below which is not possible to go, as it can be seen in fig. 8. Also, using a simple toy system, we 
have shown that the new technique can be faster than finite differences in certain situations. The new 
method introduced here, and it is already being successfully used in a number of works in progress in 
our group to compute the correcting terms appearing in the equilibrium probability distribution when 
flexible constraints are imposed on the system [Echcnique et al.(2011)Echenique, Gavasotto, and Garcfa- 
Risucno] . Moreover, given the almost ubiquitous occurrence of the concept of constraints all throughout 
the fields of computational physics and chemistry, it is expected that the method described in this work 
will find many applications in present and future problems. Some examples have been already mentioned 
in the introduction, notably the case of ground-state Born-Oppenheimer MD [Alonso et al.(2008)Alonso, 
Andrade, Echenique, Falceto, Prada-Gracia, and Rubio, Andrade et al.(2009)Andrade, Castro, Zueco, 
Alonso, Echenique, Falceto, and Rubio] (using, e.g., Hartree-Fock [Echenique and Alonso(2007)]), which 
can be regarded as a flexibly constrained problem in which the soft coordinates are the nuclear positions 
R, the hard ones are the electronic orbitals ip, and the function to be minimized is the expected value 
{'^\He{R)\'^) of the i?-dependent electronic Hamiltonian in the A^-electron Slater determinant ^E*. 
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Figure Legends 




Figure 1. Definition of the frame of reference fixed in the system. 
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Figure 2. Rotation associated to a change in a bond angle. Definition of the bond angle 9^, 
associated to atom e, and tfie unitary vector 9^ corresponding to tlie direction around which all atoms a 
with chains Ba containing e rotate if 0^ is varied while the rest of internal coordinates are kept constant. 



a 




b ^(^).© 




Figure 3. Rotation associated to a change in a dihedral angle. Definition of the dihedral angle 
(fie, associated to atom e. The positive sense of rotation is indicated in the figure, and we can 
distinguish between two situations regarding covalent connectivity: a) principal dihedral angle, and b) 
phase dihedral angle (see ref. [Echenique and Alonso(2006)]). 
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Figure 4. Special cases. Special cases of atoms that do not belong to the chain Ba connecting a to 
atom 1, but that are nevertheless used to position a. 




Figure 5. Molecules used in the numerical calculations in this section, (a) Methanol, (b) 
N-methyl-acetamide (abbreviated NMA), and (c) the tripeptide N-acetyl-glycyl-glycyl-glycyl-amide 
(abbreviated GLY3). Hydrogens are conventionally white, carbons are grey, nitrogens blue and oxygens 
red. The unconstrained dihedral angles that span the corresponding spaces /C are indicated with 
light-blue arrows, and some internal coordinates and some atoms that appear in the discussion are 
specifically labeled. The constrained dihedral angle (^22 is indicated by a red arrow in GLY3. 
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Figure 6. Derivatives of some selected coordinates of methanol. Derivatives of (a) the x 
coordinate of atom 5 in methanol, (b) the bond length 65 associated to it, (c) the bond angle ^5, and 
(d) the dihedral angle <y?5 as a function of the unconstrained coordinate tpe- Both the results of the new 
algorithm and those obtained by finite differences (FD) are depicted. The key for the different types of 
line is the same in the four graphs. 



Table 1. Stiffness of the constrained coordinates in methanol 



A(°) 


65 (A) 


^5 n 


n 


0.0 


1.090694 


109.403 


119.296 


0.01 


1.090694 


109.404 


119.296 


0.05 


1.090694 


109.404 


119.297 


0.1 


1.090694 


109.405 


119.299 


0.5 


1.090694 


109.409 


119.312 


1.0 


1.090694 


109.415 


119.329 


5.0 


1.090693 


109.462 


119.474 


10.0 


1.090692 


109.525 


119.671 



Values of the constrained coordinates associated to atom 5 of methanol for different displacements A in 

the unconstrained coordinate; ^^f^. The vahies correspond to the conformation with </?6 = 110°, and the 
number of significant figures presented is the default one provided by Gaussian 03. 
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Figure 7. Metastability of the local minima in GLY4. (a) Derivative dtp22/d'Pn of the 
constrained dihedral angle (^22) describing a peptide bond rotation in GLY3, with respect to the 

unconstrained coordinate (pn for a selected set of conformations in the working set. (b) 
Minimum-energy value of the constrained dihedral angle <^22 in the conformation 1044 of GLY3 for 
different values of the displacement A in the unconstrained coordinate 1^17. 
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Figure 8. Dependence of the error as a function of A. Average normalized error in the 
derivatives by finite differences as a function of A (see the text for a more precise definition), (a) Error 
averaged to all conformations and all atoms of the three molecular systems studied, (b) Error averaged 
to all conformations of the z-coordinate of three particular l^'^-row atoms in NMA. 
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Figure 9. Potential energy of the toy system in eq. (36). The range of x and y corresponds to 
the one explored in this work. Contour level lines and colour level indication in the surface have been 
added for visual comfort. All units are arbitrary. 



